Experience with a Combined Approach to Attribute-Matching Across Heterogeneous Databases

Authors

  • Chris Clifton
  • E. Housman
  • Arnon Rosenthal
Abstract

Determining attribute correspondences is a difficult, time-consuming, knowledge-intensive part of database integration. We report on experiences with tools that identified candidate correspondences, as a step in a large-scale effort to improve communication among Air Force systems. First, we describe a new method that was both simple and surprisingly successful: data dictionary and catalog information were dumped to unformatted text; then off-the-shelf information retrieval software estimated string similarity, generated candidate matches, and provided the interface. The second method used a different set of clues, such as statistics on database populations, to compute separate similarity metrics (using neural network techniques). We report on substantial use of the first tool, and then report some limited initial experiments that examine the two techniques’ accuracy, consistency and complementarity.
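
The first method in the abstract (text dumps of dictionary entries ranked by string similarity) can be illustrated with a minimal sketch. This is not the authors' tool: the abstract does not name the information retrieval software or its similarity measure, so TF-IDF weighting with cosine similarity is used here as a common stand-in. The function names, the attribute names, and the sample descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per document.

    Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1. The source
    does not say which weighting its retrieval software used.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def candidate_matches(source, target, top_k=3):
    """Rank target attributes for each source attribute by text similarity.

    `source` and `target` map attribute names to their dictionary text.
    """
    names_s, names_t = list(source), list(target)
    vecs = tfidf_vectors([source[a] for a in names_s] + [target[b] for b in names_t])
    vecs_s, vecs_t = vecs[: len(names_s)], vecs[len(names_s):]
    result = {}
    for a, u in zip(names_s, vecs_s):
        scored = sorted(
            ((cosine(u, v), b) for b, v in zip(names_t, vecs_t)), reverse=True
        )
        result[a] = [(b, score) for score, b in scored[:top_k]]
    return result


# Hypothetical dictionary entries from two systems.
SOURCE = {
    "ac_type": "aircraft type code",
    "dep_time": "scheduled departure time",
}
TARGET = {
    "aircraft_kind": "type of aircraft",
    "takeoff": "departure time of flight",
    "crew": "crew member name",
}

if __name__ == "__main__":
    for attr, matches in candidate_matches(SOURCE, TARGET).items():
        print(attr, matches)
```

The output is a ranked list of candidates for a human integrator to review, in line with the abstract's description of the tools as producing candidate correspondences rather than final mappings.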

Similar resources

HeteroClass: A Framework for Effective Classification from Heterogeneous Databases

Classification is an important data mining task and it has been studied from different perspectives. Recently, multi-relational classification algorithms have been studied because of their many real-world applications. However, current work has generally assumed that all the data needed to build an accurate prediction model resides in a single database. Many practical settings, however, require that we com...

SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks

One step in interoperating among heterogeneous databases is semantic integration: identifying relationships between attributes or classes in different database schemas. SEMantic INTegrator (SEMINT) is a tool based on neural networks to assist in identifying attribute correspondences in heterogeneous databases. SEMINT supports access to a variety of database systems and utilizes both schema infor...

An Integrated Geophysical Approach for Porosity and Facies Determination: A Case Study of Tamag Field of Niger Delta Hydrocarbon Province

Petrophysics, rock physics and multi-attribute analysis have been employed in an integrated approach to delineate porosity variation across Tamag Field of Niger Delta Basin. Gamma and resistivity logs were employed to identify sand bodies, which were correlated across the field. Petrophysical analysis was undertaken. Rock physics modelling and multi-attribute analysis were carried out. Two hydrocarbo...

Combining Multiple Query Interface Matchers Using Dempster-Shafer Theory of Evidence

Matching query interfaces is a crucial step in data integration across multiple Web databases. The problem is closely related to schema matching that typically exploits different features of schemas. Relying on a particular feature of schemas is not sufficient. We propose an evidential approach to combining multiple matchers using Dempster-Shafer theory of evidence. First, our approach views th...

Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network

Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...

Publication date: 1997